COMPARATIVE ANALYSIS OF METHODS OF VECTORIZATION OF HIGH DIMENSIONAL TEXT DATA

Abstract

This publication gives an overview of the problem of representing textual information for subsequent cluster analysis, in the context of processing and managing high-dimensional information. Modern requirements for analytical, search, and recommendation information systems show that the current technology market still lacks a holistic solution that provides sufficient speed and quality of results. This problem entails the need for an objective review of existing solutions for representing information in vector space, in order to form a view of the advantages and disadvantages of the analyzed approaches, as well as criteria that allow one to implement one's own approach, free of the identified weaknesses. The work gives an idea of the state of elaboration of this limited subject area. Clustering of text data is the automatic formation of subsets whose elements are instances of documents from some researched, unstructured sample of fixed dimension. This process can be classified as unsupervised learning, which implies the absence of an expert who personally assigns class indices to the original documents. However, cluster analysis without any pre-processing is impossible: the input must first be standardized and reduced to a single format and form. For this stage of the analysis, the paper discusses methods for preprocessing data. The novelty lies in the theoretical basis of the main methods of vectorization, obtained by systematizing and objectifying the proposed assumptions and by conducting a series of experimental studies. The difference from already published scientific works is the systematization of modern solutions and the formulation of hypotheses about the relevance and effectiveness of the authors' own hybridized approach to vectorization.
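The pipeline the abstract describes (standardize the input, then represent each document as a vector) can be illustrated with a minimal sketch. It assumes TF-IDF weighting with cosine similarity, which is one common vectorization technique and not necessarily one of the methods the paper compares or its hybridized approach. The function names, the smoothing formula, and the sample documents are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Standardize input: lowercase and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalized TF-IDF vectors) for a list of strings.

    Assumption: smoothed IDF, log((1 + n) / (1 + df)) + 1, chosen for this sketch.
    """
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}

    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [tf[tok] * idf[tok] for tok in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0  # guard empty documents
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity of two vectors that are already L2-normalized."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical sample: two documents about clustering, one unrelated.
docs = [
    "Text clustering groups similar documents.",
    "Clustering of text documents is unsupervised.",
    "Gradient descent minimizes a loss function.",
]
vocab, vectors = tfidf_vectors(docs)
sim_01 = cosine(vectors[0], vectors[1])
sim_02 = cosine(vectors[0], vectors[2])
```

In this sketch the vector dimension equals the vocabulary size, which is what makes text data high-dimensional in practice. A clustering algorithm would then operate on `vectors`, with no expert-assigned class labels involved.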

Journal

Journal title: Izvestiâ ÛFU

Year: 2023

ISSN: 1999-9429, 2311-3103

DOI: https://doi.org/10.18522/2311-3103-2023-2-212-226